Syntax-Aware Multi-Sense Word Embeddings for Deep Compositional Models of Meaning
Deep compositional models of meaning acting on distributional representations
of words in order to produce vectors of larger text constituents are evolving
into a popular area of NLP research. We detail a compositional distributional
framework based on a rich form of word embeddings that aims at facilitating the
interactions between words in the context of a sentence. Embeddings and
composition layers are jointly learned against a generic objective that
enhances the vectors with syntactic information from the surrounding context.
Furthermore, each word is associated with a number of senses, the most
plausible of which is selected dynamically during the composition process. We
evaluate the produced vectors qualitatively and quantitatively with positive
results. At the sentence level, the effectiveness of the framework is
demonstrated on the MSRPar task, for which we report results within the
state-of-the-art range.
Comment: Accepted for presentation at EMNLP 2015
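To make the dynamic sense selection concrete, here is a minimal numpy sketch: each word carries several candidate sense vectors, and the sense most similar to the current sentential context is chosen at composition time. The toy lexicon, the averaging-based context, and additive composition are illustrative assumptions, not the paper's architecture.

import numpy as np

rng = np.random.default_rng(0)
K, D = 3, 50                      # senses per word, embedding dimension
vocab = ["bank", "river", "money"]
senses = {w: rng.standard_normal((K, D)) for w in vocab}  # K vectors per word

def select_sense(word, context_vec):
    """Pick the sense vector with highest similarity to the context."""
    S = senses[word]                          # (K, D)
    return S[np.argmax(S @ context_vec)]

def compose(words):
    """Greedy composition: context = average of first-sense vectors."""
    context = np.mean([senses[w][0] for w in words], axis=0)
    chosen = np.stack([select_sense(w, context) for w in words])
    return chosen.mean(axis=0)                # additive composition stand-in

sentence_vec = compose(["river", "bank"])
print(sentence_vec.shape)                     # (50,)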
Enhanced sampling in generalized ensemble with large gap of sampling parameter: case study in temperature space random walk
We present an efficient sampling method for computing a partition function
and accelerating configuration sampling. The method performs a random walk in
the $\lambda$ space, with $\lambda$ being any thermodynamic variable that
characterizes a canonical ensemble, such as the reciprocal temperature
$\beta$, or any variable that the Hamiltonian explicitly depends on. The
partition function is determined by minimizing the difference of the thermal
conjugates of $\lambda$ (the energy in the case of $\lambda = \beta$), defined
as the difference between the value from the dynamically updated derivatives
of the partition function and the value directly measured from simulation.
Higher-order derivatives of the partition function are included to enhance the
Brownian motion in the $\lambda$ space. The method is much less sensitive to
the system size and the size of the $\lambda$ window than other methods. On
the two-dimensional Ising model, we show that the method asymptotically
converges to the partition function, and that the error of the logarithm of
the partition function is much smaller than that of the algorithm using the
Wang-Landau recursive scheme. The method is also applied to off-lattice model
proteins, the AB models, in which case many low-energy states are found in
different models.
Comment: 7 pages, 3 figures
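As a rough illustration of the temperature-space random walk, the following self-contained Python sketch runs simulated tempering on a one-dimensional harmonic toy system: the estimate of ln Z(beta) is refreshed by integrating the measured thermal conjugate <E> = -d ln Z / d beta, and moves between neighboring beta values are accepted with the usual tempering ratio. The harmonic potential, the trapezoidal integration, and the update schedule are assumptions for illustration, not the paper's exact scheme (which also exploits higher-order derivatives).

import numpy as np

rng = np.random.default_rng(1)
betas = np.linspace(0.2, 1.0, 9)       # ladder of reciprocal temperatures
E_sum = np.zeros_like(betas)           # accumulated energy per rung
n = np.zeros_like(betas)               # sample count per rung
lnZ = np.zeros_like(betas)             # running estimate of ln Z(beta_i)

def energy(x):
    return 0.5 * x * x                 # harmonic toy Hamiltonian

x, i = 0.0, 0                          # configuration and current rung
for step in range(200_000):
    xp = x + rng.normal(0.0, 1.0)      # Metropolis move at fixed beta
    if rng.random() < np.exp(-betas[i] * (energy(xp) - energy(x))):
        x = xp
    E_sum[i] += energy(x)
    n[i] += 1
    if step % 100 == 0:
        # Refresh lnZ by integrating <E(beta)>; unvisited rungs fall back
        # to the known toy answer 1/(2 beta) as an initial guess.
        Ebar = np.where(n > 0, E_sum / np.maximum(n, 1), 0.5 / betas)
        lnZ = np.concatenate(
            ([0.0], np.cumsum(-0.5 * (Ebar[1:] + Ebar[:-1]) * np.diff(betas))))
        # Random-walk step to a neighboring rung with the tempering ratio.
        j = min(max(i + rng.choice([-1, 1]), 0), len(betas) - 1)
        if rng.random() < np.exp(-(betas[j] - betas[i]) * energy(x) + lnZ[i] - lnZ[j]):
            i = j

print("estimated lnZ:", np.round(lnZ, 3))
print("exact lnZ:    ", np.round(-0.5 * np.log(betas / betas[0]), 3))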
Estimating statistical distributions using an integral identity
We present an identity for an unbiased estimate of a general statistical
distribution. The identity computes the distribution density from dividing a
histogram sum over a local window by a correction factor from a mean-force
integral, and the mean force can be evaluated as a configuration average. We
show that the optimal window size is roughly the inverse of the local
mean-force fluctuation. The new identity offers a more robust and precise
estimate than a previous one by Adib and Jarzynski [J. Chem. Phys. 122,
014114 (2005)]. It also allows a straightforward generalization to an
arbitrary ensemble and to a joint distribution of multiple variables. In
particular, we derive a mean-force-enhanced version of the weighted histogram
analysis method (WHAM).
The method can be used to improve distributions computed from molecular
simulations. We illustrate the use in computing a potential energy
distribution, a volume distribution in a constant-pressure ensemble, a radial
distribution function and a joint distribution of amino acid backbone dihedral
angles.
Comment: 45 pages, 7 figures; simplified derivation, a more general mean-force
formula, added discussion of the window size, extensions to WHAM, and 2D
distributions
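The following sketch shows the flavor of the identity under a first-order assumption: the histogram fraction falling in a window [x - w, x + w] is divided by the correction factor from the mean force, here treated as constant over the window so that the integral reduces to 2 sinh(phi w) / phi, with phi ~ d ln p / dx measured as a configuration average. The Gaussian test case and the per-sample force -y (exact for a standard normal) are illustrative assumptions, not the paper's general formula.

import numpy as np

rng = np.random.default_rng(2)
samples = rng.standard_normal(100_000)     # draws from p(x) = N(0, 1)
w = 0.25                                   # half window width

def density(x):
    in_win = samples[np.abs(samples - x) < w]
    if in_win.size == 0:
        return 0.0
    frac = in_win.size / samples.size      # histogram sum over the window
    phi = np.mean(-in_win)                 # mean force <d ln p / dy> in window
    corr = 2 * np.sinh(phi * w) / phi if abs(phi) > 1e-12 else 2 * w
    return frac / corr

for x in [0.0, 1.0, 2.0]:
    exact = np.exp(-0.5 * x * x) / np.sqrt(2 * np.pi)
    print(f"x={x:.1f}  estimate={density(x):.4f}  exact={exact:.4f}")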
Investigating the Role of Prior Disambiguation in Deep-learning Compositional Models of Meaning
This paper aims to explore the effect of prior disambiguation on neural
network-based compositional models, with the hope that better semantic
representations for text compounds can be produced. We disambiguate the input
word vectors before they are fed into a compositional deep net. A series of
evaluations shows the positive effect of prior disambiguation for such deep
models.
Comment: NIPS 2014
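A small numpy sketch of the prior-disambiguation step, assuming a toy two-sense lexicon: the input word's sense is fixed from context before the vectors enter a (here random and frozen) composition layer standing in for the deep net.

import numpy as np

rng = np.random.default_rng(4)
D = 16
senses = {"bank": rng.standard_normal((2, D))}   # e.g., financial vs. river sense
W = rng.standard_normal((D, 2 * D))              # frozen stand-in composition layer

def disambiguate(word, context_vecs):
    """Fix the word's sense from context *before* composition."""
    ctx = np.mean(context_vecs, axis=0)
    return senses[word][np.argmax(senses[word] @ ctx)]

def compose(u, v):
    return np.tanh(W @ np.concatenate([u, v]))   # deep-net stand-in

river = rng.standard_normal(D)
bank = disambiguate("bank", [river])             # disambiguated input vector
print(compose(river, bank).shape)                # (16,)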
Lifecycle of neural semantic parsing
Humans are born with the ability to learn to perceive, comprehend and communicate
with language. Computing machines, on the other hand, only understand programming
languages. To bridge the gap between humans and computers, deep semantic parsers
convert natural language utterances into machine-understandable logical forms. The
technique has a wide range of applications, ranging from spoken dialogue systems to
natural language interfaces. This thesis focuses on neural network-based semantic
parsing.
Traditional semantic parsers function with a domain-specific grammar that pairs
utterances and logical forms, and parse with a CKY-like algorithm in polynomial
time. Recent advances in neural semantic parsing reformulate the task as a sequence-to-
sequence learning problem. Neural semantic parsers parse a sentence in linear
time, and reduce the need for domain-specific assumptions, grammar learning, and
extensive feature engineering. But this modeling flexibility comes at a cost since
it is no longer possible to interpret how meaning composition is performed, given
that logical forms are structured objects (trees or graphs). Such knowledge plays
a critical role in understanding modeling limitations so as to build better semantic
parsers. Moreover, the sequence-to-sequence learning problem is fairly unconstrained,
both in terms of the possible derivations to consider and in terms of the target logical
forms which can be ill-formed or unexecutable. The first contribution of this thesis is
an improved neural semantic parser, which produces syntactically valid logical forms
following a transition system and grammar constraints. The transition system integrates
the generation of domain-general aspects (i.e., valid tree structures and language-specific predicates)
and domain-specific aspects (i.e., domain-specific predicates and entities) in a unified
way. The model employs various neural attention mechanisms to handle mismatches
between natural language and formal language—a central challenge in semantic parsing.
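To illustrate how a transition system can guarantee well-formed outputs, here is a toy Python sketch with three action families (NT(...) opens a subtree, TER(...) emits a predicate or entity, RED closes a subtree): at each step only grammar-valid actions are scored, so the decoder cannot produce an ill-formed tree. The action inventory and the random scoring stub are illustrative assumptions, not the thesis's actual system.

import random

random.seed(0)

def valid_actions(depth):
    """Grammar constraints: terminals and reduce only inside an open subtree."""
    acts = []
    if depth < 4:                        # cap nesting for the toy grammar
        acts += ["NT(and)", "NT(count)"]
    if depth > 0:
        acts += ["TER(pred)", "TER(ent)", "RED"]
    return acts

def decode(max_len=20):
    depth, out = 0, []
    for _ in range(max_len):
        scores = {a: random.random() for a in valid_actions(depth)}
        a = max(scores, key=scores.get)  # greedy; a real parser scores with a net
        out.append(a)
        depth += 1 if a.startswith("NT") else (-1 if a == "RED" else 0)
        if depth == 0:
            break                        # the tree is closed and well formed
    return out

print(decode())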
Training data for semantic parsers typically consists of utterances paired with logical
forms. Another challenge of semantic parsing concerns the annotation of logical forms,
which is labor-intensive. To write down the correct logical form of an utterance, one
not only needs to have expertise in the semantic formalism, but also has to ensure the
logical form matches the utterance semantics. We tackle this challenge in two ways.
On the one hand, we extend the neural semantic parser to a weakly-supervised setting
within a parser-ranker framework. The weakly-supervised setup uses training data
of utterance-denotation (e.g., question-answer) pairs, which are much easier to obtain
and therefore make it possible to scale semantic parsers to complex domains. Our framework
combines the advantages of conventional weakly-supervised semantic parsers and neural
semantic parsing. Candidate logical forms are generated by a neural decoder and
subsequently scored by a ranking component. We present methods to efficiently search
for candidate logical forms which involve spurious ambiguity—some logical forms do
not match utterance semantics but coincidentally execute to the correct denotation.
They should be excluded from training.
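A minimal sketch of the parser-ranker filtering idea: candidate logical forms are executed against a toy database, candidates that miss the gold denotation are discarded, and survivors are ordered by a crude utterance-agreement score so that spurious hits (right answer, wrong meaning) fall to the bottom. The database, executor, and overlap score are all illustrative.

import re

DB = {("capital", "france"): "paris", ("largest_city", "france"): "paris",
      ("capital", "spain"): "madrid"}

def execute(lf):
    return DB.get(lf)

def agreement(lf, utterance):
    """Crude proxy: fraction of logical-form tokens appearing in the utterance."""
    toks = set(re.split(r"[_\s]+", " ".join(lf)))
    return len(toks & set(utterance.split())) / len(toks)

utterance, denotation = "what is the capital of france", "paris"
candidates = [("capital", "france"), ("largest_city", "france"), ("capital", "spain")]

consistent = [lf for lf in candidates if execute(lf) == denotation]
ranked = sorted(consistent, key=lambda lf: agreement(lf, utterance), reverse=True)
print(ranked)   # [('capital', 'france'), ('largest_city', 'france')]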
On the other hand, we focus on how to quickly engineer a practical neural semantic
parser for closed domains, by directly reducing the annotation difficulty of utterance-logical
form pairs. We develop an interface for efficiently collecting compositional
utterance-logical form pairs and then leverage the data collection method to train neural
semantic parsers. Our method provides an end-to-end solution for closed-domain
semantic parsing given only an ontology. We also extend the end-to-end solution to
handle sequential utterances simulating a non-interactive user session. Specifically,
the data collection interface is modified to collect utterance sequences which exhibit
various co-reference patterns. Then the neural semantic parser is extended to parse
context-dependent utterances.
In summary, this thesis covers the lifecycle of designing a neural semantic parser:
from model design (i.e., how to model a neural semantic parser with an appropriate
inductive bias), training (i.e., how to perform fully supervised and weakly supervised
training for a neural semantic parser) to engineering (i.e., how to build a neural semantic
parser from a domain ontology).
Counting Solutions for the N-queens and Latin Square Problems by Efficient Monte Carlo Simulations
We apply Monte Carlo simulations to count the numbers of solutions of two
well-known combinatorial problems: the N-queens problem and the Latin square
problem. The original system is first converted to a general thermodynamic
system, from which the number of solutions of the original system is obtained
by using the method of computing the partition function. Collective moves are
used to further accelerate sampling: swap moves are used in the N-queens
problem and a cluster algorithm is developed for the Latin squares. The method
can handle systems of $10^4$ degrees of freedom with more than $10^{10000}$
solutions. We also observe a distinct finite-size effect of the Latin square
system: its heat capacity gradually develops a second maximum as the size
increases.
Comment: 10 pages, 4 figures
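The sketch below illustrates the thermodynamic conversion and the collective swap move for N-queens: configurations are permutations (one queen per row and column, so only diagonals can conflict), the energy counts attacking diagonal pairs, and Metropolis sampling with swap moves drives the system toward zero-energy states, i.e. solutions. Estimating the number of such states from the partition function is the paper's actual contribution; this toy shows only the sampler, with illustrative parameters.

import math, random

random.seed(3)
N, beta = 12, 2.5

def energy(perm):
    """Number of attacking diagonal pairs; rows and columns never conflict."""
    return sum(abs(perm[i] - perm[j]) == j - i
               for i in range(N) for j in range(i + 1, N))

perm = list(range(N))
random.shuffle(perm)
E = energy(perm)
for step in range(1_000_000):
    if E == 0:
        break
    i, j = random.sample(range(N), 2)           # collective swap move
    perm[i], perm[j] = perm[j], perm[i]
    dE = energy(perm) - E
    if dE <= 0 or random.random() < math.exp(-beta * dE):
        E += dE                                 # accept the swap
    else:
        perm[i], perm[j] = perm[j], perm[i]     # reject: undo the swap

print("energy:", E, "configuration:", perm)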
Neural Summarization by Extracting Sentences and Words
Traditional approaches to extractive summarization rely heavily on
human-engineered features. In this work we propose a data-driven approach based
on neural networks and continuous sentence features. We develop a general
framework for single-document summarization composed of a hierarchical document
encoder and an attention-based extractor. This architecture allows us to
develop different classes of summarization models which can extract sentences
or words. We train our models on large-scale corpora containing hundreds of
thousands of document-summary pairs. Experimental results on two summarization
datasets demonstrate that our models obtain results comparable to the state of
the art without any access to linguistic annotation.
Comment: ACL 2016 conference paper with appendix
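A compact PyTorch sketch of the hierarchical-encoder/extractor pattern: word embeddings are pooled into sentence vectors (mean pooling standing in for the paper's sentence encoder), an LSTM encodes the document, and each sentence receives an extraction probability. All sizes and the pooling choice are illustrative assumptions.

import torch
import torch.nn as nn

class Extractor(nn.Module):
    def __init__(self, vocab=1000, d=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)
        self.doc_lstm = nn.LSTM(d, d, batch_first=True)
        self.score = nn.Linear(d, 1)

    def forward(self, doc):                      # doc: (n_sents, n_words) word ids
        sent_vecs = self.emb(doc).mean(dim=1)    # sentence encoder (mean pooling)
        states, _ = self.doc_lstm(sent_vecs.unsqueeze(0))   # document encoder
        return torch.sigmoid(self.score(states)).squeeze(-1).squeeze(0)

doc = torch.randint(0, 1000, (5, 12))            # 5 sentences, 12 words each
probs = Extractor()(doc)
print(probs)                                     # one extraction probability per sentence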
Weakly-supervised Neural Semantic Parsing with a Generative Ranker
Weakly-supervised semantic parsers are trained on utterance-denotation pairs,
treating logical forms as latent. The task is challenging due to the large
search space and spuriousness of logical forms. In this paper we introduce a
neural parser-ranker system for weakly-supervised semantic parsing. The parser
generates candidate tree-structured logical forms from utterances using clues
from the denotations. These candidates are then ranked based on two criteria: their
likelihood of executing to the correct denotation, and their agreement with the
utterance semantics. We present a scheduled training procedure to balance the
contribution of the two objectives. Furthermore, we propose to use a neurally
encoded lexicon to inject prior domain knowledge into the model. Experiments on
three Freebase datasets demonstrate the effectiveness of our semantic parser,
achieving results within the state-of-the-art range.
Comment: In EMNLP-CoNLL 2018
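A minimal sketch of the scheduled training idea: the ranker's loss mixes a denotation-consistency term and an utterance-agreement term, with the mixing weight annealed over training so that early epochs trust execution signals and later epochs shift toward semantic agreement. The linear schedule and the constant toy losses are illustrative assumptions, not the paper's actual objective values.

def schedule(epoch, total):
    """Linear anneal: weight moves from 0 to 1 over the first half of training."""
    return min(1.0, epoch / (0.5 * total))

def mixed_loss(denotation_loss, agreement_loss, epoch, total):
    a = schedule(epoch, total)
    return (1 - a) * denotation_loss + a * agreement_loss

for epoch in range(0, 10, 3):
    print(epoch, round(mixed_loss(0.8, 0.3, epoch, 10), 3))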